A Parallel System for Textual

Authors

  • Sanda M. Harabagiu
  • Dan Moldovan
Abstract

This paper presents a possible solution for the text inference problem: extracting information that is unstated in a text, but implied. Text inference is central to natural language applications such as information extraction and dissemination, text understanding, summarization, and translation. Our solution takes advantage of a semantic English dictionary available in electronic form that provides the basis for the development of a large linguistic knowledge base. The inference algorithm consists of a set of highly parallel search methods that, when applied to the knowledge base, find contexts in which sentences are interpreted. These contexts reveal information relevant to the text. Implementation, results and parallelism analysis are discussed.

This paper addresses the issue of parallelism in a class of problems that is largely unexplored, yet of growing importance. Text inference refers to the problem of extracting information that is not stated directly in a text, but is implied. This may be achieved by reasoning about a text, making logical judgments on the basis of circumstantial evidence from a large knowledge base that contains knowledge about the world. A related, but much simpler, problem is information retrieval, where the goal is the recognition of facts, events and properties that are explicitly stated in the text. While current information retrieval systems that process millions of sentences per minute with an accuracy close to that of humans have been built [25], the process of large-scale inference has not been automated yet.

The major obstacles that need to be resolved are: (1) building knowledge bases large enough to capture world knowledge, (2) finding a knowledge representation scheme suited to common sense reasoning, and (3) developing inference methods and control mechanisms able to provide relevant inferences at speeds comparable to humans.

In this paper we present a parallel inference system that operates on a very large linguistic knowledge base. The system is scalable both in size and accuracy and is highly parallel. The novelty of this work derives from our use of an extended linguistic knowledge base for the English language called WordNet, and an inference algorithm that consists of a set of parallel search procedures over the linguistic semantic network (i.e., the knowledge base). WordNet is being developed at Princeton by a group led by Miller [17]. Text inference is of great importance especially today when there are many newspapers, books and other …

IEEECS Log Number D96261.
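The idea of running several searches over a semantic network and keeping the concepts where they meet can be illustrated with a minimal sketch. This is not the authors' algorithm or their WordNet-based knowledge base: the network `TOY_NETWORK`, its relation labels, and the helper names `reachable` and `connecting_contexts` are all hypothetical, invented for this example. Threads are used only to show that the per-word searches are independent of one another.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy semantic network: concept -> list of (relation, concept).
# These entries are invented for illustration and are not taken from WordNet.
TOY_NETWORK = {
    "hungry": [("cause", "eat")],
    "eat": [("entail", "chew"), ("object", "food")],
    "restaurant": [("purpose", "serve")],
    "serve": [("object", "food")],
    "food": [],
    "chew": [],
}


def reachable(start, network, max_depth=3):
    """Breadth-first search from `start`.

    Returns a dict mapping each reached concept to the list of
    (relation, concept) steps that leads to it from `start`.
    """
    paths = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) >= max_depth:
            continue  # do not expand beyond the depth limit
        for relation, neighbor in network.get(node, []):
            if neighbor not in paths:
                paths[neighbor] = paths[node] + [(relation, neighbor)]
                queue.append(neighbor)
    return paths


def connecting_contexts(words, network, max_depth=3):
    """Run one search per word concurrently and keep the concepts
    that every search reaches, together with the path from each word."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda w: reachable(w, network, max_depth), words))
    shared = set(results[0])
    for paths in results[1:]:
        shared &= set(paths)
    return {concept: [paths[concept] for paths in results] for concept in sorted(shared)}


if __name__ == "__main__":
    for concept, paths in connecting_contexts(["hungry", "restaurant"], TOY_NETWORK).items():
        print(concept, paths)
```

In this toy network the two searches meet at "food", which suggests the unstated link between being hungry and a restaurant. A real system would need a far larger knowledge base and some way to rank the paths it finds; the sketch covers neither.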




Publication date: 1999